Note: We are aware that two packages with the same name already exist (https://github.com/hadley/r2d3, https://github.com/jamesthomson/R2D3). However, what we are proposing is a completely different thing, and this name suits it best. Hadley’s library is no longer maintained and is only a D3 wrapper for ggplot2, while James Thomson’s library wraps JS code in R functions. Neither has been officially submitted to CRAN. We are also open to changing the name to avoid confusion in the community.

Applicants: Filip Stachura, Marek Rogala, Olga Mierzwa-Sulima, Krzysztof Wróbel
on behalf of Appsilon Data Science. Email: hello@appsilondatascience.com

Short Description

Data-driven documents in R without predefined wrappers. With this package you can create any visualisation possible in D3 directly in the R language.

The problem and motivation

The problems that data scientists need to visualise are often very complex and require more advanced solutions. It’s natural to look for better data visualisation methods in R. Although there are many well-supported R packages for creating plots (ggplot2, plotly, rbokeh), when it comes to non-standard visualisations like network graphs, bubble charts or radial charts, you have to look for something created by the community.

If you know JavaScript (JS), you might think about wrapping a D3 component into an htmlwidget and using it in your RMarkdown report or Shiny app. Unfortunately, only some D3 visualisation methods have been implemented in R by the community, for instance: NetworkGraphs (http://christophergandrud.github.io/networkD3/), BubbleChart (https://github.com/jcheng5/bubbles), CoffeeWheel (https://github.com/armish/coffeewheel), etc. What’s more, only some of them generate interactive visualisations; others only allow image or SVG generation. They offer very limited customisation, and customising them is time-consuming and often difficult without appropriate JS knowledge. Since not many data scientists know JS, they often resort to manually editing the produced SVG files or static images in graphics editors (e.g. Adobe Photoshop). In other cases they order visualisations from D3 specialists.

Either way, it is currently difficult, time-consuming and costly to create truly interactive and advanced visualisations in R.

Value Proposition

To tackle the problem of creating customised, interactive and pleasant data visualisations, we are proposing a standardised package wrapping D3, which will allow data scientists to recreate any data visualisation possible in D3 directly in the R language without any knowledge of JS. With a D3-like syntax, anyone would be able to create the visualisations available in D3, among others like those below.

In the longer term it will lead to much cleaner code in R scripts and Shiny apps, because less JS will be mixed with R. It will also be less likely to run into the compatibility issues that can arise when integrating various R packages based on D3. Hopefully it will also be possible to unify the current standards for wrapping D3 plugins and components so that they can be used wherever necessary, especially in RMarkdown reports and Shiny apps.

Last but not least, all the necessary ingredients for nice visualisations will be unified in a single R package with well-designed standards. That gives a better chance of strong community support throughout the years of its development.

At Appsilon Data Science we have broad experience with both JS and R, and in our day-to-day work we have often had to tackle these problems. Our customers often require non-standard data visualisations, most of which we had to recreate in D3. Since this is also a common problem in the R community, we have decided to gather our experience and release it as an open source solution. In order to do it to the highest possible standards and quality, we are applying for an R Consortium grant.

We think this project aligns well with the R Consortium’s goals:

Solution

We solve this using htmlwidgets. Thanks to that, users can use D3 in RMarkdown reports and reactive Shiny apps. The structure of the visualisation (plot, graph, etc.) is described in R and later interpreted in the browser. Because the code is interpreted, it can be adapted for compatibility with different D3 APIs (there are breaking changes between v3 and v4).
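To illustrate the idea of describing a visualisation in R and adapting it before it is interpreted, here is a minimal sketch in base R. It is not the API of our POC: the names `d3_chain`, `d3_call`, `adapt_for_version` and `renames_v4` are hypothetical and exist only for this example. The sketch records D3-style calls as plain data and rewrites two method names that were renamed between D3 v3 and v4 (`d3.scale.linear()` became `d3.scaleLinear()`, `d3.scale.ordinal()` became `d3.scaleOrdinal()`).

```r
# Hypothetical sketch: record D3-style calls in R instead of executing them.
# None of these function names come from the r2d3 POC.
d3_chain <- function() {
  structure(list(calls = list()), class = "d3_chain")
}

# Append one recorded step (method name plus arguments) to the chain.
d3_call <- function(chain, method, ...) {
  chain$calls[[length(chain$calls) + 1]] <- list(method = method, args = list(...))
  chain
}

# Two method renames introduced by D3 v4 (v3 name -> v4 name).
renames_v4 <- c("scale.linear" = "scaleLinear", "scale.ordinal" = "scaleOrdinal")

# Rewrite recorded method names for the targeted D3 major version.
adapt_for_version <- function(chain, d3_major) {
  if (d3_major >= 4) {
    chain$calls <- lapply(chain$calls, function(step) {
      if (step$method %in% names(renames_v4)) {
        step$method <- unname(renames_v4[step$method])
      }
      step
    })
  }
  chain
}

spec <- d3_chain()
spec <- d3_call(spec, "select", "body")
spec <- d3_call(spec, "append", "svg")
spec <- d3_call(spec, "scale.linear")

spec_v4 <- adapt_for_version(spec, 4)
methods_v4 <- vapply(spec_v4$calls, function(step) step$method, character(1))
print(methods_v4)
```

Because the recorded steps are plain lists, a structure like this could be serialised and handed to the browser side of a widget, where it would be replayed against whichever D3 version is loaded.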

We have implemented a Proof of Concept (POC) that allows the creation of various plots and animations, hosted on the target GitHub page: https://github.com/appsilon/r2d3. Creating it has already eliminated the biggest risks of this project: we managed to prove that the general implementation concept is feasible. We already know that 90% of D3 plots can be easily implemented purely in R and we believe that we can make it 100%. At the same time we want to make our Domain Specific Language (DSL) as convenient to use as possible. We understand that this is going to play a crucial role in the adoption of this library.

The biggest challenge we will have to face is debugging.

Currently, debugging in the POC is not easy. We want to provide a way of reporting and handling exceptions. Hopefully we will be able to report an error together with a link to documentation explaining the proper usage of the method (as React currently does).
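One possible shape for such error reporting, sketched in base R under stated assumptions: the condition class `d3_usage_error`, the helpers `d3_error` and `check_attr`, and the documentation address are all hypothetical placeholders, not part of the POC. The point is only that a custom condition can carry a documentation link that calling code (or a Shiny app) can pick up.

```r
# Placeholder documentation root; not a real r2d3 documentation site.
docs_base <- "https://example.org/r2d3-docs"

# Build a custom error condition that carries a link to the relevant docs page.
d3_error <- function(method, problem) {
  doc_url <- paste0(docs_base, "/", method)
  structure(
    class = c("d3_usage_error", "error", "condition"),
    list(
      message = sprintf("%s(): %s. See %s", method, problem, doc_url),
      call = NULL,
      doc_url = doc_url
    )
  )
}

# Hypothetical validation of an attr(name, value) step before it is recorded.
check_attr <- function(name, value) {
  if (!is.character(name) || length(name) != 1) {
    stop(d3_error("attr", "attribute name must be a single string"))
  }
  list(method = "attr", args = list(name, value))
}

# A handler can react to the specific class and read the link from the condition.
result <- tryCatch(
  check_attr(42, "red"),
  d3_usage_error = function(e) e$doc_url
)
print(result)
```

Since the condition also inherits from `error`, code that does not know about the custom class would still see an ordinary R error whose message contains the link.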

We do not want to play with D3 internals, so D3.js-related problems will be redirected to its GitHub Issues page. However, in case of unexpected problems during the project’s time-frame, we will be in touch with the D3 community.

During the POC we made a few design assumptions for future package development:

All these assumptions are going to be discussed with the R community in the 1st phase of this project, before moving on to the 2nd phase of implementation. This will bring the best standards and practices to this library.

Example 1: Basic bar chart

Example 2: Sphere animation

With r2d3, creating animations feels as natural as it does in D3. There is no need to animate individual frames. The examples in this gist are created entirely in R and are usable:

The current POC code is not perfect, but we have still created fairly complicated animations directly in RStudio, watching them in the Viewer. It feels awesome and we want more.

Example 3: Easily extending D3 with plugins

One of the crucial requirements we have in mind is to allow visualisations to be easily extended through D3 plugins. As an example we’ve chosen the awesome Textures.js library. Below you can see how we’ve implemented an interactive scatter plot showing mtcars data using r2d3 extended with textures.

An important challenge here is to track the right versions of library dependencies and resolve them correctly.
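As a rough illustration of what tracking and resolving versions could involve, the base R sketch below checks plugins against the loaded D3 version. Everything in it is an assumption made for the example: the registry layout, the plugin names, and the version ranges are invented and do not describe the real requirements of Textures.js or any other plugin.

```r
# Hypothetical registry: plugin names, versions and D3 ranges are made up
# for illustration and do not describe any real plugin's requirements.
plugin_registry <- list(
  "example-textures" = list(version = "1.2.0", d3_min = "3.0.0", d3_max = "4.99.99"),
  "example-legacy"   = list(version = "0.3.1", d3_min = "3.0.0", d3_max = "3.99.99")
)

# TRUE when the D3 version falls inside the plugin's declared range.
is_compatible <- function(plugin, d3_version) {
  entry <- plugin_registry[[plugin]]
  if (is.null(entry)) stop("Unknown plugin: ", plugin)
  v <- numeric_version(d3_version)
  v >= numeric_version(entry$d3_min) && v <= numeric_version(entry$d3_max)
}

# Keep only the plugins that can be loaded alongside the given D3 version.
resolve_plugins <- function(plugins, d3_version) {
  ok <- vapply(plugins, is_compatible, logical(1), d3_version = d3_version)
  plugins[ok]
}

resolved <- resolve_plugins(c("example-textures", "example-legacy"), "4.2.0")
print(resolved)
```

Using `numeric_version` rather than string comparison matters here, because it compares version components numerically (so "3.10.0" is treated as newer than "3.9.0").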

HTML source: https://gist.github.com/filipstachura/c926abd54ff7225043d5ee7ca354af6e

R source: below

The plan

We are going to divide this project into 3 basic phases:

Phase 1 - Design and validation - approximately 1 month

We are going to start a discussion on GitHub about the most important design aspects mentioned above:

We are going to announce our project’s design discussion on multiple R-related social channels, such as:

We will also try to implement 10 example D3-specific visualisations with the current POC version of r2d3 to identify any issues not yet detected.

Phase 2 - Core package implementation - approximately 3-4 months

During this time we are going to finalise the implementation of this package and release it to CRAN:

Phase 3 - Package maturity tests - approximately 1-2 months

During this phase we are going to find 2-4 beta testing partners who are willing to use our library commercially. We are going to support them during the integration process, testing the maturity of our package along the way. The gathered feedback will later be used to improve the package.

To ease the learning curve for less experienced R users, we are also going to create a few working samples, ready to paste into their projects, and publish them in our repository.

How Can The ISC Help?

We estimate the scope of the project at about 6 months of work. We are going to need additional time resources, so we are going to hire a data scientist who will dedicate his/her efforts solely to this project (funding of $12,400). During the 3rd phase we will need an extra $2,000 to support the beta testers. This funding will allow us to make sure that this open source project has the same priority as the commercial projects we work on and has the best possible quality.

At the same time we want to spark the adoption of this package through conferences. If accepted, we believe it would make sense to present the r2d3 package at the EARL conference in San Francisco (US location) and at useR! in Brussels (Europe location). We estimate that together they will cost $3,000 ($2,000 and $1,000 respectively).

We are also applying for a separate, smaller budget for marketing this package. This would cover the costs of a series of blog posts showing the potential of using D3 straight from R. We want to write 6 separate articles at a total cost of $800.

In our opinion, the biggest impact the ISC can have on this work would be to put us in direct contact with data scientists from R Consortium member companies. Thanks to that we could take a ‘customer development’ approach, gathering requirements from them and going through beta tests directly with them. This could also lead to very successful case studies that we can share with others on how to use r2d3 to deliver business results.

Summing it up:

Total: $18,200 (can be paid in a few instalments)
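As a quick check that the requested amounts add up to the stated total, the figures quoted in this section can be summed in R. The item labels are our own shorthand for this snippet; the amounts are the ones given above.

```r
# Amounts requested in this section, in USD.
budget <- c(
  data_scientist = 12400,  # dedicated data scientist for the project
  beta_support   = 2000,   # supporting beta testers during the 3rd phase
  conferences    = 3000,   # EARL ($2,000) and useR! ($1,000)
  blog_posts     = 800     # series of 6 articles
)

total <- sum(budget)
print(total)
```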

Dissemination

We are going to open-source this project on GitHub under the MIT license: https://github.com/appsilon/r2d3.
We will be using r-hub to test compatibility on different operating systems and machines. As stated before, by the end of the 2nd phase the package will be released to CRAN.

Besides the two mandatory blog posts on the R Consortium blog, we are going to use our Appsilon Data Science Blog to report significant progress on this package. During the 3rd phase we want to publish a few posts there about creating nice visualisations with r2d3 (radial charts, 3D earth visualisations, etc.). It’s worth mentioning that our blog’s RSS feed is also connected to R-Bloggers, which gives it quite high community visibility. A hosted post on the social channels of RStudio and Microsoft’s VSTR, which are members of the R Consortium, could give us an extra adoption boost. It would be nice to use the social media channels of other members as well, if this package fits well with their business or marketing strategies.

We are active in a few R-specific data science groups on Slack and FB, so we are going to publish updates there as well. From past experience we know that we can get a lot of valuable feedback from the community there.

This year we are going to participate in useR! and EARL, so depending on the proposal acceptance date we might be able to give an r2d3 presentation during this year’s editions. Otherwise we are going to give this talk at useR! and EARL in 2018, both in Europe and the USA. We will also make sure to promote this project at R events in Poland.